1  Abstract


In the eighth quarter of the work effort, we focused on a) conducting experiments on real-world
data sets using the developed algorithms, b) continuing the design/implementation of the Multiscale
Singular Value Decomposition (SVD) algorithm and c) packaging the software for release as
open source. This report documents experimental results with the Multiscale SVD algorithms.

The project is currently on track. In the upcoming quarter, we will continue applying the
developed algorithms to various data sets and continue the design/implementation of the multiscale
heat kernel coordinates algorithms. No problems are currently anticipated.

In this quarter, we continued the design and implementation of the new multiscale SVD (MSVD)
algorithms. We applied the MSVD to a publicly available LIDAR dataset to distinguish between
vegetation and the forest floor. The final results are presented in this report (initial results
were reported in the previous quarterly report).

The project is currently on track. In the upcoming quarters, we will continue applying the
developed algorithms to various data sets and focus on the design and development of the
multiscale heat kernel coordinates algorithms. No problems are currently anticipated.

3  Introduction


The primary project effort over the last quarter focused on completing the design/development of
the multiscale SVD algorithms [1]. Results from experiments conducted on a publicly available
LIDAR dataset [5] are provided in Section 5.


Use  or  disclosure  of  data  contained  on  this  sheet  is  subject  to  restrictions  on  the  title  page  of  this  report. 

4  Methods,  Assumptions  and  Procedures


4.1  Multiscale  Singular  Value  Decomposition 

The  Multiscale  Singular  Value  Decomposition  (MSVD)  was  introduced  in  the  earlier  technical 
reports  [6]  [7].  The  MSVD  provides  a  spectral  readout  of  the  dataset  at  all  scales. 

We applied the MSVD algorithm to a real-world LIDAR data set. The experimental setup and initial
results were reported in Section 5.2 of the previous quarterly report [7]. We present the full
results of that experiment in this report.

4.2  Deliverables  /  Milestones 


Date      Deliverables / Milestones                                                                      Status
--------  ---------------------------------------------------------------------------------------------  ------
Oct 2010  Progress report for period 1, 1st quarter                                                      ✓
Jan 2011  Progress report for period 1, 2nd quarter / complete randomized matrix decompositions task
Apr 2011  Progress report for period 1, 3rd quarter / complete approximate nearest neighbors task
Jul 2011  Progress report for period 1, 4th quarter / complete experiments - part 1
Oct 2011  Progress report for period 2, 1st quarter
Jan 2012  Progress report for period 2, 2nd quarter / complete multiscale SVD task
Apr 2012  Progress report for period 2, 3rd quarter
Jul 2012  Progress report for period 2, 4th quarter / complete experiments - part 2
Oct 2012  Progress report for period 3, 1st quarter
Jan 2013  Progress report for period 3, 2nd quarter / complete multiscale Heat Kernel task
Apr 2013  Progress report for period 3, 3rd quarter
Jul 2013  Final project report + software + documentation on CDROM / complete experiments - part 3


5  Results  and  Discussion


We  present  experimental  results  for  the  LIDAR  dataset  below. 

5.1  Experiment:  LIDAR  Dataset  (MSVD  using  nearest  neighbors) 

This publicly available dataset [5] contains 3-dimensional LIDAR point clouds representing
ten sections of riparian floor and vegetation. An analysis of the dataset for classification
purposes is presented in [4]. The dataset comprises 639,520 data points, each categorized as floor
or vegetation. The dataset is depicted in Figure 1.

Figure 1. Example 2: LIDAR dataset (639,520 data points)

Sensitivity and specificity are used as metrics for the classification task. These are standard
statistical measures of the performance of a binary classification test and are complementary to
the Type II and Type I error rates, respectively. Sensitivity is the proportion of actual
positives that are correctly identified as such. Specificity is the proportion of actual negatives
that are correctly identified. The classification accuracy reported in [4] is 95%, measured as
min{sensitivity, specificity}.
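
As an illustration of the metric above, the following is a minimal sketch of how sensitivity,
specificity and their minimum could be computed from true and predicted labels. The function
names and the choice of which class counts as "positive" are assumptions made for this example;
the report does not state which of floor or vegetation is treated as the positive class.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary labeling.

    `positive` is the label treated as the positive class (an assumption
    of this sketch, not something fixed by the report).
    """
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1  # actual positive, predicted positive
            else:
                fn += 1  # actual positive, predicted negative
        else:
            if p == positive:
                fp += 1  # actual negative, predicted positive
            else:
                tn += 1  # actual negative, predicted negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def min_accuracy(y_true, y_pred, positive=1):
    """Accuracy measured as min{sensitivity, specificity}."""
    return min(sensitivity_specificity(y_true, y_pred, positive))
```

Taking the minimum of the two rates means a classifier cannot score well by favoring the more
frequent class.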

The MSVD algorithms were used for the classification task of labeling each point as floor or
vegetation. For comparison purposes, the same training and test data sets were used as in the
reference paper [4]. An SVM classifier (also used in the reference paper) was used to classify
the MSVD features. Continuous and 10-discretization pre-processing options for the SVM were
considered. For 10-discretization, in addition to the usual coordinate scaling, each coordinate
was discretized by moving its value to the center of one of 10 equal-probability intervals of the
standard normal N(0,1) distribution (see Figure 2). The exact nearest-neighbor (NN) algorithm was
used to compute neighbors at scales 9, 10 and 11. Scale 9 provided roughly 100 to 200 points in
each ball. It should be pointed out that the reference paper takes specific details of the problem
(relative dimensions of leaves, branches and soil) into account. In contrast, our approach using
the MSVD employs a generic methodology for data analysis.
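
The 10-discretization step can be sketched as follows. This is only an illustration under stated
assumptions: the "usual coordinate scaling" is taken to be standardization to zero mean and unit
variance, and the "center" of each interval is taken to be its median under N(0,1). The report
does not specify either choice, and the function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def normal_bins(n_bins=10):
    """Edges and centers of n_bins equal-probability intervals of N(0,1)."""
    std = NormalDist(0.0, 1.0)
    # Interior edges are the k/n_bins quantiles, k = 1 .. n_bins - 1.
    edges = [std.inv_cdf(k / n_bins) for k in range(1, n_bins)]
    # Assumed center: the median of each interval, i.e. the (k + 0.5)/n_bins quantile.
    centers = [std.inv_cdf((k + 0.5) / n_bins) for k in range(n_bins)]
    return edges, centers


def discretize(values, n_bins=10):
    """Standardize one coordinate, then snap each value to its interval center."""
    edges, centers = normal_bins(n_bins)
    mean = fmean(values)
    sd = pstdev(values)
    out = []
    for v in values:
        z = (v - mean) / sd if sd > 0 else 0.0  # constant column maps to z = 0
        out.append(centers[bisect_right(edges, z)])
    return out
```

After this step each coordinate takes at most 10 distinct values, which is the input the SVM sees
under the 10-discretization option.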



Figure  2.  10-discretization  of  scaled  features 

The original features from the dataset are simply the 3-dimensional spatial coordinates. Applying
the MSVD algorithm to the dataset, we obtain derived features corresponding to the singular values
and singular vectors at scales 9, 10 and 11. The dimension of our derived feature vector is 36:
3 singular values and three 3-dimensional singular vectors (9 components) at each of the 3 scales,
i.e. 3 x (3 + 9) = 36.
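
To make the construction of the 36-dimensional feature vector concrete, the sketch below computes
local SVD features in a ball around each point at several scales. It is not the project's MSVD
implementation: the function name, the use of explicit ball radii in place of the scale indices
9, 10 and 11, the brute-force neighbor search and the zero-filling of near-empty balls are all
assumptions made for illustration.

```python
import numpy as np


def msvd_features(points, radii):
    """Concatenate local SVD features over scales for each point.

    points: (n, 3) array of coordinates.
    radii:  one ball radius per scale (hypothetical stand-ins for scales).
    Returns an (n, 12 * len(radii)) array: per scale, 3 singular values
    followed by the 3 right singular vectors (9 components).
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    per_scale = d + d * d
    feats = np.zeros((n, per_scale * len(radii)))
    for s, r in enumerate(radii):
        off = s * per_scale
        for i in range(n):
            # Exact (brute-force) search for neighbors inside the ball.
            dist = np.linalg.norm(points - points[i], axis=1)
            ball = points[dist <= r]
            if len(ball) < 2:
                continue  # "empty" ball: features stay zero
            centered = ball - ball.mean(axis=0)
            _, sv, vt = np.linalg.svd(centered, full_matrices=False)
            k = len(sv)  # k < d when the ball holds fewer than d points
            feats[i, off:off + k] = sv
            feats[i, off + d:off + d + k * d] = vt.ravel()
    return feats
```

With three radii the output has 3 x (3 + 9) = 36 columns per point. The brute-force search costs
O(n^2) per scale, so a spatial index would be needed for a dataset of this size.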

We defined an experimental design to test our intuitive understanding that multiple scales will
most likely provide better results than any single scale. The experimental design is shown in
Figure 3.

Figure  3.  Experimental  design  for  LIDAR  dataset


The final results in terms of specificity and sensitivity for the above design are tabulated in
Figure 4. Using the MSVD features reduces the classification error to 2% (from 5% in [4]).
More importantly, as expected, the combination of localized scales works better than any
single scale. Further, scale 11 has many "empty" balls, which explains its poor accuracy.

Figure  4.  Sensitivity/specificity  results  for  the  LIDAR  dataset

The project is on track to wrap up the multiscale SVD algorithms. A comprehensive paper
on the MSVD is forthcoming. We presented the results of using these algorithms on the LIDAR
dataset. In the next quarter, we will focus on the design/implementation of the new multiscale
heat kernel coordinates algorithms and continue with algorithmic improvements and experimentation
using the developed algorithms.

No  problems  are  currently  anticipated. 